Overview

Dataset statistics

Number of variables: 22
Number of observations: 1647
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 283.2 KiB
Average record size in memory: 176.1 B
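
The average record size follows from the two figures above it: total size in memory divided by the number of observations. A minimal check, assuming the report uses binary units (1 KiB = 1024 B); the variable names are illustrative and not part of the report:

```python
# Figures copied from the dataset statistics above.
total_size_kib = 283.2
n_observations = 1647

# Assumption: KiB is the binary unit, so 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")  # 176.1 B
```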

Variable types

Numeric: 11
Categorical: 11

Alerts

agency has constant value "DPR" Constant
agency_name has constant value "Department_of_Parks_and_Recreation" Constant
state has constant value "NY" Constant
Created_Date has a high cardinality: 461 distinct values High cardinality
Closed_Date has a high cardinality: 484 distinct values High cardinality
unique_key is highly correlated with Created_Year and 1 other field High correlation
incident_zip is highly correlated with longitude High correlation
latitude is highly correlated with longitude High correlation
longitude is highly correlated with incident_zip and 1 other field High correlation
Created_Year is highly correlated with unique_key and 2 other fields High correlation
Created_Month_Number is highly correlated with Created_Year High correlation
Closed_Year is highly correlated with unique_key and 1 other field High correlation
unique_key is highly correlated with Created_Year and 1 other field High correlation
incident_zip is highly correlated with longitude High correlation
latitude is highly correlated with longitude High correlation
longitude is highly correlated with incident_zip and 1 other field High correlation
Created_Year is highly correlated with unique_key and 1 other field High correlation
Closed_Year is highly correlated with unique_key and 1 other field High correlation
unique_key is highly correlated with Created_Year and 1 other field High correlation
Created_Year is highly correlated with unique_key and 2 other fields High correlation
Created_Month_Number is highly correlated with Created_Year High correlation
Closed_Year is highly correlated with unique_key and 1 other field High correlation
state is highly correlated with status and 7 other fields High correlation
status is highly correlated with state and 2 other fields High correlation
agency_name is highly correlated with state and 7 other fields High correlation
Created_Month is highly correlated with state and 3 other fields High correlation
Closed_Month is highly correlated with state and 2 other fields High correlation
borough is highly correlated with state and 3 other fields High correlation
city is highly correlated with state and 3 other fields High correlation
complaint_type is highly correlated with state and 3 other fields High correlation
agency is highly correlated with state and 7 other fields High correlation
CID is highly correlated with unique_key and 7 other fields High correlation
unique_key is highly correlated with CID and 7 other fields High correlation
complaint_type is highly correlated with CID and 5 other fields High correlation
city is highly correlated with borough and 3 other fields High correlation
borough is highly correlated with city and 3 other fields High correlation
incident_zip is highly correlated with city and 3 other fields High correlation
latitude is highly correlated with city and 3 other fields High correlation
longitude is highly correlated with city and 3 other fields High correlation
Created_Year is highly correlated with CID and 5 other fields High correlation
Created_Month is highly correlated with CID and 7 other fields High correlation
Created_Month_Number is highly correlated with CID and 7 other fields High correlation
Created_Day is highly correlated with CID and 1 other field High correlation
Closed_Year is highly correlated with unique_key and 4 other fields High correlation
Closed_Month is highly correlated with CID and 7 other fields High correlation
Closed_Month_Number is highly correlated with CID and 6 other fields High correlation
CID is uniformly distributed Uniform
CID has unique values Unique
unique_key has unique values Unique
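
The Constant and Unique alerts above can be read as simple conditions on the distinct count. The sketch below assumes a column is flagged Constant when it holds exactly one distinct value and Unique when every value is distinct; `classify_column` is a hypothetical helper written for illustration, not a pandas-profiling function, and the CID values are assumed to be the integers 0..1646 (consistent with its reported minimum, maximum and distinct count).

```python
def classify_column(values):
    """Return alert labels for one column (hypothetical helper)."""
    n = len(values)
    distinct = len(set(values))
    alerts = []
    if n > 0 and distinct == 1:
        alerts.append("Constant")   # a single repeated value
    if n > 1 and distinct == n:
        alerts.append("Unique")     # no value occurs twice
    return alerts


agency = ["DPR"] * 1647     # constant column, as reported for agency
cid = list(range(1647))     # assumption: CID runs 0..1646

print(classify_column(agency))  # ['Constant']
print(classify_column(cid))     # ['Unique']
```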

Reproduction

Analysis started: 2021-12-03 01:41:10.147699
Analysis finished: 2021-12-03 01:41:28.724369
Duration: 18.58 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
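
The reported duration is the difference between the two timestamps. A small check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2021-12-03 01:41:10.147699")
finished = datetime.fromisoformat("2021-12-03 01:41:28.724369")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 18.58 seconds
```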

Variables

CID
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 1647
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 823
Minimum: 0
Maximum: 1646
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 13.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 82.3
Q1: 411.5
median: 823
Q3: 1234.5
95-th percentile: 1563.7
Maximum: 1646
Range: 1646
Interquartile range (IQR): 823

Descriptive statistics

Standard deviation: 475.5922623
Coefficient of variation (CV): 0.5778763819
Kurtosis: -1.2
Mean: 823
Median Absolute Deviation (MAD): 412
Skewness: 0
Sum: 1355481
Variance: 226188
Monotonicity: Strictly increasing
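
The CID figures are what one would expect from a plain row index. The sketch below assumes CID is the integer sequence 0..1646 (an assumption consistent with the minimum, maximum, distinct count and strict monotonicity reported above) and recomputes several statistics with the standard library, using the sample variance (n - 1 denominator).

```python
import statistics

cid = list(range(1647))  # assumption: CID is the row index 0..1646

mean = statistics.mean(cid)
variance = statistics.variance(cid)   # sample variance, n - 1 denominator
std = statistics.stdev(cid)
median = statistics.median(cid)
# Median absolute deviation: median of |x - median|.
mad = statistics.median(abs(x - median) for x in cid)
cv = std / mean                       # coefficient of variation

print(sum(cid), mean, variance, mad)
print(round(std, 4), round(cv, 4))
```

Under that assumption the sum, mean, variance and MAD agree with the values listed in the report.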
Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
0                    1      0.1%
541                  1      0.1%
561                  1      0.1%
559                  1      0.1%
557                  1      0.1%
555                  1      0.1%
553                  1      0.1%
551                  1      0.1%
549                  1      0.1%
547                  1      0.1%
Other values (1637)  1637   99.4%

Value  Count  Frequency (%)
0      1      0.1%
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%

Value  Count  Frequency (%)
1646   1      0.1%
1645   1      0.1%
1644   1      0.1%
1643   1      0.1%
1642   1      0.1%
1641   1      0.1%
1640   1      0.1%
1639   1      0.1%
1638   1      0.1%
1637   1      0.1%

unique_key
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
UNIQUE

Distinct: 1647
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39508745.86
Minimum: 17573251
Maximum: 52237113
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.0 KiB

Quantile statistics

Minimum: 17573251
5-th percentile: 17593857.7
Q1: 40328785
median: 43190556
Q3: 45725472
95-th percentile: 46240173.8
Maximum: 52237113
Range: 34663862
Interquartile range (IQR): 5396687

Descriptive statistics

Standard deviation: 10065780.58
Coefficient of variation (CV): 0.2547734776
Kurtosis: 0.6697275797
Mean: 39508745.86
Median Absolute Deviation (MAD): 2571312
Skewness: -1.553084839
Sum: 6.507090443 × 10^10
Variance: 1.013199387 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
42766338             1      0.1%
39836266             1      0.1%
45290491             1      0.1%
42650242             1      0.1%
42800154             1      0.1%
43478109             1      0.1%
36553341             1      0.1%
45374074             1      0.1%
40471160             1      0.1%
39840374             1      0.1%
Other values (1637)  1637   99.4%

Value     Count  Frequency (%)
17573251  1      0.1%
17574216  1      0.1%
17574396  1      0.1%
17574540  1      0.1%
17576203  1      0.1%
17576302  1      0.1%
17576344  1      0.1%
17576350  1      0.1%
17576414  1      0.1%
17576688  1      0.1%

Value     Count  Frequency (%)
52237113  1      0.1%
52191184  1      0.1%
52137035  1      0.1%
51993389  1      0.1%
51717754  1      0.1%
51661797  1      0.1%
51588872  1      0.1%
51131439  1      0.1%
51091152  1      0.1%
50981348  1      0.1%

agency
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.0 KiB
DPR: 1647

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: DPR
2nd row: DPR
3rd row: DPR
4th row: DPR
5th row: DPR

Common Values

Value  Count  Frequency (%)
DPR    1647   100.0%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
dpr    1647   100.0%

agency_name
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.0 KiB
Department_of_Parks_and_Recreation: 1647

Length

Max length: 34
Median length: 34
Mean length: 34
Min length: 34

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Department_of_Parks_and_Recreation
2nd row: Department_of_Parks_and_Recreation
3rd row: Department_of_Parks_and_Recreation
4th row: Department_of_Parks_and_Recreation
5th row: Department_of_Parks_and_Recreation

Common Values

Value                               Count  Frequency (%)
Department_of_Parks_and_Recreation  1647   100.0%

Length

Histogram of lengths of the category

Pie chart

Value                               Count  Frequency (%)
department_of_parks_and_recreation  1647   100.0%

complaint_type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 13.0 KiB
Dead/Dying_Tree: 598
Illegal_Tree_Damage: 572
Damaged_Tree: 477

Length

Max length: 19
Median length: 15
Mean length: 15.52034001
Min length: 12

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Damaged_Tree
2nd row: Damaged_Tree
3rd row: Damaged_Tree
4th row: Damaged_Tree
5th row: Damaged_Tree

Common Values

Value                Count  Frequency (%)
Dead/Dying_Tree      598    36.3%
Illegal_Tree_Damage  572    34.7%
Damaged_Tree         477    29.0%
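
The Frequency (%) column is each count divided by the number of observations. A minimal sketch that recomputes the shares from the counts listed above (the dictionary and variable names are illustrative):

```python
# Counts copied from the Common Values table for complaint_type.
counts = {
    "Dead/Dying_Tree": 598,
    "Illegal_Tree_Damage": 572,
    "Damaged_Tree": 477,
}

total = sum(counts.values())  # should equal the number of observations
shares = {name: round(100 * c / total, 1) for name, c in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```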

Length

Histogram of lengths of the category

Pie chart

Value                Count  Frequency (%)
dead/dying_tree      598    36.3%
illegal_tree_damage  572    34.7%
damaged_tree         477    29.0%

status
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.0 KiB
Closed: 1641
In Progress: 6

Length

Max length: 11
Median length: 6
Mean length: 6.018214936
Min length: 6

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Closed
2nd row: Closed
3rd row: Closed
4th row: Closed
5th row: Closed

Common Values

Value        Count  Frequency (%)
Closed       1641   99.6%
In Progress  6      0.4%

Length

Histogram of lengths of the category

Pie chart

Value     Count  Frequency (%)
closed    1641   99.3%
in        6      0.4%
progress  6      0.4%

city
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 44
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 13.0 KiB
Brooklyn: 535
Staten_Island: 180
New_York: 165
Bronx: 114
Flushing: 66
Other values (39): 587

Length

Max length: 19
Median length: 8
Mean length: 9.305403764
Min length: 5

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Ridgewood
2nd row: Brooklyn
3rd row: Bayside
4th row: Brooklyn
5th row: Bellerose

Common Values

Value              Count  Frequency (%)
Brooklyn           535    32.5%
Staten_Island      180    10.9%
New_York           165    10.0%
Bronx              114    6.9%
Flushing           66     4.0%
Jamaica            54     3.3%
Astoria            44     2.7%
Whitestone         43     2.6%
Ridgewood          29     1.8%
Fresh_Meadows      28     1.7%
Other values (34)  389    23.6%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
brooklyn           535    32.5%
staten_island      180    10.9%
new_york           165    10.0%
bronx              114    6.9%
flushing           66     4.0%
jamaica            54     3.3%
astoria            44     2.7%
whitestone         43     2.6%
ridgewood          29     1.8%
fresh_meadows      28     1.7%
Other values (34)  389    23.6%

borough
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 13.0 KiB
Queens: 653
Brooklyn: 535
Staten_Island: 180
Manhattan: 165
Bronx: 114

Length

Max length: 13
Median length: 8
Mean length: 7.646023072
Min length: 5

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Queens
2nd row: Brooklyn
3rd row: Queens
4th row: Brooklyn
5th row: Queens

Common Values

Value          Count  Frequency (%)
Queens         653    39.6%
Brooklyn       535    32.5%
Staten_Island  180    10.9%
Manhattan      165    10.0%
Bronx          114    6.9%

Length

Histogram of lengths of the category

Pie chart

Value          Count  Frequency (%)
queens         653    39.6%
brooklyn       535    32.5%
staten_island  180    10.9%
manhattan      165    10.0%
bronx          114    6.9%

incident_zip
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 162
Distinct (%): 9.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11002.62356
Minimum: 10001
Maximum: 11694
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.0 KiB

Quantile statistics

Minimum: 10001
5-th percentile: 10022
Q1: 10466
median: 11221
Q3: 11364
95-th percentile: 11427
Maximum: 11694
Range: 1693
Interquartile range (IQR): 898

Descriptive statistics

Standard deviation: 488.7673827
Coefficient of variation (CV): 0.04442280336
Kurtosis: -0.6669087162
Mean: 11002.62356
Median Absolute Deviation (MAD): 151
Skewness: -0.9873939765
Sum: 18121321
Variance: 238893.5544
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
11357               43     2.6%
11234               41     2.5%
10312               37     2.2%
10314               35     2.1%
11230               35     2.1%
11215               31     1.9%
11209               29     1.8%
11385               29     1.8%
10306               23     1.4%
11358               22     1.3%
Other values (152)  1322   80.3%

Value  Count  Frequency (%)
10001  5      0.3%
10002  11     0.7%
10003  5      0.3%
10009  6      0.4%
10011  11     0.7%
10012  5      0.3%
10013  3      0.2%
10014  8      0.5%
10016  5      0.3%
10017  1      0.1%

Value  Count  Frequency (%)
11694  7      0.4%
11691  5      0.3%
11436  8      0.5%
11435  6      0.4%
11434  14     0.9%
11433  10     0.6%
11432  16     1.0%
11429  10     0.6%
11428  5      0.3%
11427  9      0.5%

latitude
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1571
Distinct (%): 95.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.69924573
Minimum: 40.50190414
Maximum: 40.91107395
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.0 KiB

Quantile statistics

Minimum: 40.50190414
5-th percentile: 40.56550636
Q1: 40.6336399
median: 40.69852462
Q3: 40.75537135
95-th percentile: 40.85004618
Maximum: 40.91107395
Range: 0.40916981
Interquartile range (IQR): 0.121731445

Descriptive statistics

Standard deviation: 0.08152149339
Coefficient of variation (CV): 0.002003022216
Kurtosis: -0.2659598809
Mean: 40.69924573
Median Absolute Deviation (MAD): 0.06070443
Skewness: 0.1464206562
Sum: 67031.65772
Variance: 0.006645753885
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
40.88562974          6      0.4%
40.75500475          3      0.2%
40.628261            3      0.2%
40.77331721          3      0.2%
40.68524221          3      0.2%
40.76790461          3      0.2%
40.71740699          3      0.2%
40.5292104           3      0.2%
40.8551972           2      0.1%
40.71614795          2      0.1%
Other values (1561)  1616   98.1%

Value        Count  Frequency (%)
40.50190414  1      0.1%
40.50526752  2      0.1%
40.50709698  1      0.1%
40.50741188  1      0.1%
40.50927191  1      0.1%
40.51393834  1      0.1%
40.51459924  1      0.1%
40.52723277  1      0.1%
40.52785073  1      0.1%
40.52816389  1      0.1%

Value        Count  Frequency (%)
40.91107395  1      0.1%
40.9086316   1      0.1%
40.9074933   1      0.1%
40.90525418  1      0.1%
40.90190684  1      0.1%
40.89870485  1      0.1%
40.89783947  1      0.1%
40.8975924   1      0.1%
40.8971504   1      0.1%
40.89513239  1      0.1%

longitude
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1571
Distinct (%): 95.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.92060394
Minimum: -74.25319469
Maximum: -73.70143666
Zeros: 0
Zeros (%): 0.0%
Negative: 1647
Negative (%): 100.0%
Memory size: 13.0 KiB

Quantile statistics

Minimum: -74.25319469
5-th percentile: -74.15105504
Q1: -73.98335184
median: -73.9249353
Q3: -73.82995363
95-th percentile: -73.74141335
Maximum: -73.70143666
Range: 0.55175803
Interquartile range (IQR): 0.15339821

Descriptive statistics

Standard deviation: 0.1143565406
Coefficient of variation (CV): -0.001547018483
Kurtosis: 0.01330237305
Mean: -73.92060394
Median Absolute Deviation (MAD): 0.06919525
Skewness: -0.4191522725
Sum: -121747.2347
Variance: 0.01307741837
Monotonicity: Not monotonic
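
Several of the longitude figures are derived from others in the same tables: the range is maximum minus minimum, the IQR is Q3 minus Q1, and the coefficient of variation is the standard deviation divided by the mean, which is why it is negative for a variable whose mean is negative. A short check using the rounded values printed above (variable names are illustrative):

```python
# Values copied from the longitude statistics above (already rounded).
minimum, maximum = -74.25319469, -73.70143666
q1, q3 = -73.98335184, -73.82995363
mean, std = -73.92060394, 0.1143565406

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation (negative mean)

print(round(value_range, 8), round(iqr, 8), round(cv, 9))
```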
Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
-73.86605482         6      0.4%
-73.82414688         3      0.2%
-74.19088392         3      0.2%
-74.02765337         3      0.2%
-73.96685821         3      0.2%
-73.74887454         3      0.2%
-73.95714428         3      0.2%
-73.98463631         3      0.2%
-74.25319469         2      0.1%
-74.10453133         2      0.1%
Other values (1561)  1616   98.1%

Value         Count  Frequency (%)
-74.25319469  2      0.1%
-74.25083997  1      0.1%
-74.24913564  1      0.1%
-74.2473424   1      0.1%
-74.24624324  1      0.1%
-74.24593999  1      0.1%
-74.243082    1      0.1%
-74.22119788  1      0.1%
-74.22106038  1      0.1%
-74.22034832  1      0.1%

Value         Count  Frequency (%)
-73.70143666  1      0.1%
-73.70233953  1      0.1%
-73.70237702  1      0.1%
-73.70268696  1      0.1%
-73.70293887  1      0.1%
-73.70296026  1      0.1%
-73.70353144  1      0.1%
-73.70387775  1      0.1%
-73.70401747  1      0.1%
-73.70409587  1      0.1%

Created_Year
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 6
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2017.686096
Minimum: 2010
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.0 KiB

Quantile statistics

Minimum: 2010
5-th percentile: 2010
Q1: 2018
median: 2019
Q3: 2020
95-th percentile: 2020
Maximum: 2021
Range: 11
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 3.590414503
Coefficient of variation (CV): 0.001779471301
Kurtosis: 0.7296685896
Mean: 2017.686096
Median Absolute Deviation (MAD): 1
Skewness: -1.58866897
Sum: 3323129
Variance: 12.8910763
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Value  Count  Frequency (%)
2019   624    37.9%
2020   576    35.0%
2010   286    17.4%
2018   108    6.6%
2017   41     2.5%
2021   12     0.7%

Value  Count  Frequency (%)
2010   286    17.4%
2017   41     2.5%
2018   108    6.6%
2019   624    37.9%
2020   576    35.0%
2021   12     0.7%

Value  Count  Frequency (%)
2021   12     0.7%
2020   576    35.0%
2019   624    37.9%
2018   108    6.6%
2017   41     2.5%
2010   286    17.4%
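
Because Created_Year has only six distinct values, its sum and mean can be recovered directly from the value counts. A small sketch that cross-checks the tables against the reported Sum and Mean (names are illustrative):

```python
# Value counts copied from the Created_Year tables above.
year_counts = {2010: 286, 2017: 41, 2018: 108, 2019: 624, 2020: 576, 2021: 12}

n = sum(year_counts.values())                               # observations
total = sum(year * c for year, c in year_counts.items())    # weighted sum
mean = total / n

print(n, total, round(mean, 6))
```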

Created_Month
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 12
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 13.0 KiB
May: 432
July: 360
June: 160
April: 117
January: 100
Other values (7): 478

Length

Max length: 9
Median length: 4
Mean length: 4.889496053
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: December
2nd row: July
3rd row: July
4th row: July
5th row: December

Common Values

Value             Count  Frequency (%)
May               432    26.2%
July              360    21.9%
June              160    9.7%
April             117    7.1%
January           100    6.1%
December          96     5.8%
March             94     5.7%
February          89     5.4%
August            79     4.8%
September         59     3.6%
Other values (2)  61     3.7%

Length

Histogram of lengths of the category

Value             Count  Frequency (%)
may               432    26.2%
july              360    21.9%
june              160    9.7%
april             117    7.1%
january           100    6.1%
december          96     5.8%
march             94     5.7%
february          89     5.4%
august            79     4.8%
september         59     3.6%
Other values (2)  61     3.7%

Created_Month_Number
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 12
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.830601093
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 5
median: 5
Q3: 7
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.626130226
Coefficient of variation (CV): 0.4504047155
Kurtosis: 0.2145281585
Mean: 5.830601093
Median Absolute Deviation (MAD): 2
Skewness: 0.3865640598
Sum: 9603
Variance: 6.896559967
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Value             Count  Frequency (%)
5                 432    26.2%
7                 360    21.9%
6                 160    9.7%
4                 117    7.1%
1                 100    6.1%
12                96     5.8%
3                 94     5.7%
2                 89     5.4%
8                 79     4.8%
9                 59     3.6%
Other values (2)  61     3.7%

Value  Count  Frequency (%)
1      100    6.1%
2      89     5.4%
3      94     5.7%
4      117    7.1%
5      432    26.2%
6      160    9.7%
7      360    21.9%
8      79     4.8%
9      59     3.6%
10     51     3.1%

Value  Count  Frequency (%)
12     96     5.8%
11     10     0.6%
10     51     3.1%
9      59     3.6%
8      79     4.8%
7      360    21.9%
6      160    9.7%
5      432    26.2%
4      117    7.1%
3      94     5.7%

Created_Day
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 31
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.81542198
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 9
median: 15
Q3: 23
95-th percentile: 30
Maximum: 31
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.359695017
Coefficient of variation (CV): 0.5285786891
Kurtosis: -1.064411667
Mean: 15.81542198
Median Absolute Deviation (MAD): 7
Skewness: 0.0779890187
Sum: 26048
Variance: 69.88450078
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)

Value              Count  Frequency (%)
13                 132    8.0%
6                  76     4.6%
20                 74     4.5%
11                 66     4.0%
12                 65     3.9%
22                 58     3.5%
30                 57     3.5%
9                  57     3.5%
26                 57     3.5%
21                 56     3.4%
Other values (21)  949    57.6%

Value  Count  Frequency (%)
1      43     2.6%
2      34     2.1%
3      40     2.4%
4      42     2.6%
5      39     2.4%
6      76     4.6%
7      50     3.0%
8      55     3.3%
9      57     3.5%
10     49     3.0%

Value  Count  Frequency (%)
31     27     1.6%
30     57     3.5%
29     48     2.9%
28     50     3.0%
27     40     2.4%
26     57     3.5%
25     49     3.0%
24     49     3.0%
23     37     2.2%
22     58     3.5%

Created_Date
Categorical

HIGH CARDINALITY

Distinct: 461
Distinct (%): 28.0%
Missing: 0
Missing (%): 0.0%
Memory size: 13.0 KiB
2010-07-13: 82
2010-07-06: 54
2010-07-09: 23
2020-05-20: 22
2019-12-30: 21
Other values (456): 1445

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 195
Unique (%): 11.8%

Sample

1st row: 2019-12-25
2nd row: 2010-07-03
3rd row: 2019-07-05
4th row: 2010-07-03
5th row: 2019-12-26

Common Values

Value               Count  Frequency (%)
2010-07-13          82     5.0%
2010-07-06          54     3.3%
2010-07-09          23     1.4%
2020-05-20          22     1.3%
2019-12-30          21     1.3%
2020-05-12          20     1.2%
2020-05-15          18     1.1%
2020-05-19          18     1.1%
2010-07-04          17     1.0%
2020-05-11          17     1.0%
Other values (451)  1355   82.3%

Length

Histogram of lengths of the category

Value               Count  Frequency (%)
2010-07-13          82     5.0%
2010-07-06          54     3.3%
2010-07-09          23     1.4%
2020-05-20          22     1.3%
2019-12-30          21     1.3%
2020-05-12          20     1.2%
2020-05-15          18     1.1%
2020-05-19          18     1.1%
2010-07-04          17     1.0%
2020-05-11          17     1.0%
Other values (451)  1355   82.3%

Closed_Year
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 10
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2017.986035
Minimum: 2010
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.0 KiB

Quantile statistics

Minimum: 2010
5-th percentile: 2010
Q1: 2019
median: 2020
Q3: 2020
95-th percentile: 2020
Maximum: 2021
Range: 11
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 3.618895229
Coefficient of variation (CV): 0.001793320254
Kurtosis: 0.867126401
Mean: 2017.986035
Median Absolute Deviation (MAD): 1
Skewness: -1.632307395
Sum: 3323623
Variance: 13.09640268
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Value  Count  Frequency (%)
2020   759    46.1%
2019   494    30.0%
2010   246    14.9%
2021   71     4.3%
2017   29     1.8%
2011   23     1.4%
2012   9      0.5%
2018   8      0.5%
2013   4      0.2%
2014   4      0.2%

Value  Count  Frequency (%)
2010   246    14.9%
2011   23     1.4%
2012   9      0.5%
2013   4      0.2%
2014   4      0.2%
2017   29     1.8%
2018   8      0.5%
2019   494    30.0%
2020   759    46.1%
2021   71     4.3%

Value  Count  Frequency (%)
2021   71     4.3%
2020   759    46.1%
2019   494    30.0%
2018   8      0.5%
2017   29     1.8%
2014   4      0.2%
2013   4      0.2%
2012   9      0.5%
2011   23     1.4%
2010   246    14.9%

Closed_Month
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 12
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 13.0 KiB
July: 305
June: 183
May: 165
March: 151
December: 131
Other values (7): 712

Length

Max length: 9
Median length: 5
Mean length: 5.618093503
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: January
2nd row: July
3rd row: July
4th row: July
5th row: March

Common Values

Value             Count  Frequency (%)
July              305    18.5%
June              183    11.1%
May               165    10.0%
March             151    9.2%
December          131    8.0%
April             122    7.4%
January           115    7.0%
August            112    6.8%
September         107    6.5%
October           95     5.8%
Other values (2)  161    9.8%

Length

Histogram of lengths of the category

Value             Count  Frequency (%)
july              305    18.5%
june              183    11.1%
may               165    10.0%
march             151    9.2%
december          131    8.0%
april             122    7.4%
january           115    7.0%
august            112    6.8%
september         107    6.5%
october           95     5.8%
Other values (2)  161    9.8%

Closed_Month_Number
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 12
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.321190043
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 6
Q3: 8
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.102060075
Coefficient of variation (CV): 0.4907398851
Kurtosis: -0.770053615
Mean: 6.321190043
Median Absolute Deviation (MAD): 2
Skewness: 0.1166877643
Sum: 10411
Variance: 9.622776712
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Value             Count  Frequency (%)
7                 305    18.5%
6                 183    11.1%
5                 165    10.0%
3                 151    9.2%
12                131    8.0%
4                 122    7.4%
1                 115    7.0%
8                 112    6.8%
9                 107    6.5%
2                 95     5.8%
Other values (2)  161    9.8%

Value  Count  Frequency (%)
1      115    7.0%
2      95     5.8%
3      151    9.2%
4      122    7.4%
5      165    10.0%
6      183    11.1%
7      305    18.5%
8      112    6.8%
9      107    6.5%
10     95     5.8%

Value  Count  Frequency (%)
12     131    8.0%
11     66     4.0%
10     95     5.8%
9      107    6.5%
8      112    6.8%
7      305    18.5%
6      183    11.1%
5      165    10.0%
4      122    7.4%
3      151    9.2%

Closed_Day
Real number (ℝ≥0)

Distinct: 31
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.80327869
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 13.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 9
median: 16
Q3: 23
95-th percentile: 29
Maximum: 31
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.430412107
Coefficient of variation (CV): 0.5334596873
Kurtosis: -1.089960228
Mean: 15.80327869
Median Absolute Deviation (MAD): 7
Skewness: -0.03965529037
Sum: 26028
Variance: 71.0718483
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value  Count  Frequency (%)
26  91  5.5%
23  87  5.3%
12  78  4.7%
13  76  4.6%
20  71  4.3%
19  66  4.0%
14  62  3.8%
16  60  3.6%
17  59  3.6%
6  58  3.5%
Other values (21)  939  57.0%

Value  Count  Frequency (%)
1  49  3.0%
2  51  3.1%
3  57  3.5%
4  45  2.7%
5  35  2.1%
6  58  3.5%
7  39  2.4%
8  57  3.5%
9  48  2.9%
10  54  3.3%

Value  Count  Frequency (%)
31  28  1.7%
30  42  2.6%
29  32  1.9%
28  38  2.3%
27  50  3.0%
26  91  5.5%
25  32  1.9%
24  45  2.7%
23  87  5.3%
22  53  3.2%

Closed_Date
Categorical

HIGH CARDINALITY

Distinct484
Distinct (%)29.4%
Missing0
Missing (%)0.0%
Memory size13.0 KiB
2020-12-23  40
2020-03-26  27
2010-07-13  21
2010-07-14  19
2019-06-12  18
Other values (479)  1522

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0
Distinct scripts0
Distinct blocks0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique186
Unique (%)11.3%
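
"Distinct" and "Unique" measure different things. The sketch below assumes that Distinct counts the different values present and that Unique counts the values occurring exactly once, which is how the figures in this section appear to be defined. The helper name and the toy data are hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique, distinct share, unique share) for a column."""
    counts = Counter(values)
    n = len(values)
    distinct = len(counts)                              # different values present
    unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once
    return distinct, unique, distinct / n, unique / n

# Toy column, not taken from the dataset.
dates = ["2020-12-23", "2020-12-23", "2020-03-26", "2010-07-13"]
print(distinct_and_unique(dates))  # (3, 2, 0.75, 0.5)

# The percentages reported for Closed_Date follow from the counts above.
print(round(484 / 1647 * 100, 1), round(186 / 1647 * 100, 1))
```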

Sample

1st row2020-01-03
2nd row2010-07-06
3rd row2019-07-09
4th row2010-07-22
5th row2020-03-31

Common Values

Value  Count  Frequency (%)
2020-12-23  40  2.4%
2020-03-26  27  1.6%
2010-07-13  21  1.3%
2010-07-14  19  1.2%
2019-06-12  18  1.1%
2020-05-21  17  1.0%
2020-05-26  15  0.9%
2010-07-06  15  0.9%
2020-04-03  14  0.9%
2019-06-13  13  0.8%
Other values (474)  1448  87.9%

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
2020-12-23  40  2.4%
2020-03-26  27  1.6%
2010-07-13  21  1.3%
2010-07-14  19  1.2%
2019-06-12  18  1.1%
2020-05-21  17  1.0%
2020-05-26  15  0.9%
2010-07-06  15  0.9%
2020-04-03  14  0.9%
2020-02-26  13  0.8%
Other values (474)  1448  87.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

state
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size13.0 KiB
NY  1647

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0
Distinct scripts0
Distinct blocks0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNY
2nd rowNY
3rd rowNY
4th rowNY
5th rowNY

Common Values

Value  Count  Frequency (%)
NY  1647  100.0%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
ny  1647  100.0%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Interactions

[Figure: 121 pairwise interaction plots between variables (Matplotlib v3.3.4)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
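
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs; the function names and the toy data are assumptions. Ties receive the average of the rank positions they occupy.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]   # monotonic in x, but not linear
print(spearman(x, y), pearson(x, y))
```

For this toy data ρ is 1 up to floating-point rounding, while Pearson's r stays below 1 because the relationship is not linear.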

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
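
A minimal sketch of this calculation, which also checks the location and scale invariance mentioned above. The function name and the toy data are illustrative assumptions.

```python
from statistics import mean

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n factors in the covariance and the standard deviations cancel.
    return cov / (sx * sy)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]   # roughly linear in x

r = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_transformed)
```

Both calls give the same coefficient up to floating-point rounding, since shifting or positively rescaling either variable leaves r unchanged.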

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
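
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition; library routines often apply a tie correction (tau-b) and may give different values when ties are present, so treat this as an illustration rather than the report's own computation. The function name and data are assumptions.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1    # both variables order the pair the same way
        elif s < 0:
            discordant += 1    # the variables order the pair in opposite ways
        # s == 0 is a tie in x or y: counted in neither group
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
# 5 concordant pairs and 1 discordant pair out of 6, so tau = 4/6.
print(kendall_tau_a(x, y))
```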

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
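
As an illustration, the bias correction commonly attributed to Bergsma (2013) can be sketched as follows. This is a plain-Python approximation written for this explanation, assuming the usual form of the correction; it is not the report's own code, and the function name and toy data are hypothetical.

```python
from collections import Counter
from math import sqrt

def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(x)
    row_totals = Counter(x)
    col_totals = Counter(y)
    observed = Counter(zip(x, y))

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, ra in row_totals.items():
        for b, cb in col_totals.items():
            expected = ra * cb / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Correction: shrink phi^2 and the table dimensions by their expected bias.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0   # a constant column carries no association information
    return sqrt(phi2_corr / denom)

labels = ["a"] * 50 + ["b"] * 50
print(cramers_v_corrected(labels, labels))               # perfect association
print(cramers_v_corrected(["a", "a", "b", "b"] * 25,
                          ["c", "d", "c", "d"] * 25))    # independent labels
```

With a constant column, such as `state` in this dataset, the denominator is zero and the sketch returns 0.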

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index)  CID  unique_key  agency  agency_name  complaint_type  status  city  borough  incident_zip  latitude  longitude  Created_Year  Created_Month  Created_Month_Number  Created_Day  Created_Date  Closed_Year  Closed_Month  Closed_Month_Number  Closed_Day  Closed_Date  state
0  0  45244842  DPR  Department_of_Parks_and_Recreation  Damaged_Tree  Closed  Ridgewood  Queens  11385  40.704753  -73.884978  2019  December  12  25  2019-12-25  2020  January  1  3  2020-01-03  NY
1  1  17573251  DPR  Department_of_Parks_and_Recreation  Damaged_Tree  Closed  Brooklyn  Brooklyn  11208  40.672582  -73.877543  2010  July  7  3  2010-07-03  2010  July  7  6  2010-07-06  NY
2  2  43202583  DPR  Department_of_Parks_and_Recreation  Damaged_Tree  Closed  Bayside  Queens  11360  40.774356  -73.780371  2019  July  7  5  2019-07-05  2019  July  7  9  2019-07-09  NY
3  3  17574216  DPR  Department_of_Parks_and_Recreation  Damaged_Tree  Closed  Brooklyn  Brooklyn  11223  40.606472  -73.983261  2010  July  7  3  2010-07-03  2010  July  7  22  2010-07-22  NY
4  4  45245555  DPR  Department_of_Parks_and_Recreation  Damaged_Tree  Closed  Bellerose  Queens  11426  40.733696  -73.722079  2019  December  12  26  2019-12-26  2020  March  3  31  2020-03-31  NY
5  5  45245562  DPR  Department_of_Parks_and_Recreation  Damaged_Tree  Closed  Bronx  Bronx  10466  40.891522  -73.853887  2019  December  12  26  2019-12-26  2020  September  9  25  2020-09-25  NY
6  6  39498033  DPR  Department_of_Parks_and_Recreation  Damaged_Tree  Closed  Brooklyn  Brooklyn  11208  40.685603  -73.877425  2018  June  6  19  2018-06-19  2019  October  10  20  2019-10-20  NY
7  7  17574396  DPR  Department_of_Parks_and_Recreation  Damaged_Tree  Closed  Bayside  Queens  11360  40.783658  -73.777083  2010  July  7  3  2010-07-03  2010  September  9  10  2010-09-10  NY
8  8  17574540  DPR  Department_of_Parks_and_Recreation  Damaged_Tree  Closed  Flushing  Queens  11358  40.769916  -73.791101  2010  July  7  3  2010-07-03  2010  July  7  6  2010-07-06  NY
9  9  44067986  DPR  Department_of_Parks_and_Recreation  Damaged_Tree  Closed  New_York  Manhattan  10025  40.801620  -73.962540  2019  October  10  16  2019-10-16  2019  October  10  17  2019-10-17  NY

Last rows

(index)  CID  unique_key  agency  agency_name  complaint_type  status  city  borough  incident_zip  latitude  longitude  Created_Year  Created_Month  Created_Month_Number  Created_Day  Created_Date  Closed_Year  Closed_Month  Closed_Month_Number  Closed_Day  Closed_Date  state
1637  1637  42729347  DPR  Department_of_Parks_and_Recreation  Illegal_Tree_Damage  Closed  Brooklyn  Brooklyn  11216  40.673885  -73.955308  2019  May  5  20  2019-05-20  2020  March  3  13  2020-03-13  NY
1638  1638  43494341  DPR  Department_of_Parks_and_Recreation  Illegal_Tree_Damage  Closed  Brooklyn  Brooklyn  11238  40.674828  -73.966801  2019  August  8  8  2019-08-08  2019  August  8  12  2019-08-12  NY
1639  1639  43494342  DPR  Department_of_Parks_and_Recreation  Illegal_Tree_Damage  Closed  Brooklyn  Brooklyn  11228  40.623988  -74.008901  2019  August  8  8  2019-08-08  2019  September  9  3  2019-09-03  NY
1640  1640  42730769  DPR  Department_of_Parks_and_Recreation  Illegal_Tree_Damage  Closed  Jackson_Heights  Queens  11372  40.750911  -73.871447  2019  May  5  21  2019-05-21  2020  April  4  7  2020-04-07  NY
1641  1641  42732045  DPR  Department_of_Parks_and_Recreation  Illegal_Tree_Damage  Closed  Far_Rockaway  Queens  11691  40.593654  -73.753204  2019  May  5  21  2019-05-21  2019  October  10  3  2019-10-03  NY
1642  1642  42732088  DPR  Department_of_Parks_and_Recreation  Illegal_Tree_Damage  Closed  New_York  Manhattan  10036  40.761185  -73.993217  2019  May  5  21  2019-05-21  2019  June  6  19  2019-06-19  NY
1643  1643  42734647  DPR  Department_of_Parks_and_Recreation  Illegal_Tree_Damage  Closed  Oakland_Gardens  Queens  11364  40.747261  -73.769463  2019  May  5  21  2019-05-21  2019  August  8  12  2019-08-12  NY
1644  1644  45875454  DPR  Department_of_Parks_and_Recreation  Illegal_Tree_Damage  Closed  Staten_Island  Staten_Island  10314  40.600315  -74.164407  2020  March  3  22  2020-03-22  2020  December  12  23  2020-12-23  NY
1645  1645  45875899  DPR  Department_of_Parks_and_Recreation  Illegal_Tree_Damage  Closed  Staten_Island  Staten_Island  10301  40.645497  -74.085713  2020  March  3  22  2020-03-22  2020  March  3  24  2020-03-24  NY
1646  1646  42813797  DPR  Department_of_Parks_and_Recreation  Illegal_Tree_Damage  Closed  New_York  Manhattan  10027  40.812363  -73.961443  2019  May  5  30  2019-05-30  2019  July  7  8  2019-07-08  NY